Pulse-code modulation (PCM) is a method used to digitally represent analog signals. It is the standard form of digital audio in computers, digital telephony and other digital audio applications. In a PCM stream, the amplitude of the analog signal is sampled at uniform intervals, and each sample is quantized to the nearest value within a range of digital steps. Alec Reeves, Claude Shannon, Barney Oliver and John R. Pierce are credited with its invention.
Linear pulse-code modulation (LPCM) is a specific type of PCM in which the quantization levels are linearly uniform. This is in contrast to PCM encodings in which quantization levels vary as a function of amplitude (as with the A-law algorithm or the μ-law algorithm). Though PCM is a more general term, it is often used to describe data encoded as LPCM.
A PCM stream has two basic properties that determine the stream's fidelity to the original analog signal: the sampling rate, which is the number of times per second that samples are taken; and the bit depth, which determines the number of possible digital values that can be used to represent each sample.
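The two properties can be illustrated with a minimal Python sketch of linear PCM encoding. The function name `sample_and_quantize`, the test tone and the scaling convention (full scale maps to the largest positive code) are assumptions made for illustration, not part of any standard.

```python
import math

def sample_and_quantize(signal, sample_rate, n_samples, bit_depth):
    """Sample `signal` (a function of time in seconds, with values in [-1, 1])
    at uniform intervals and quantize each sample to a signed integer code."""
    levels = 2 ** bit_depth            # number of possible digital values
    max_code = levels // 2 - 1         # e.g. 127 for 8 bits
    min_code = -(levels // 2)          # e.g. -128 for 8 bits
    codes = []
    for n in range(n_samples):
        t = n / sample_rate            # uniform sampling instants
        code = round(signal(t) * max_code)   # nearest quantization step
        codes.append(max(min_code, min(max_code, code)))  # clip to range
    return codes

def tone(t):
    """Hypothetical test input: a 1 kHz sine wave at full scale."""
    return math.sin(2 * math.pi * 1000 * t)

# One period of the tone at 8 kHz and 8 bits per sample.
codes = sample_and_quantize(tone, sample_rate=8000, n_samples=8, bit_depth=8)
```

Raising `sample_rate` yields more codes per second of signal, while raising `bit_depth` makes the steps between adjacent codes finer.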
In 1920, the Bartlane cable picture transmission system used telegraph signaling of characters punched in paper tape to send samples of images quantized to 5 levels. In 1926, Paul M. Rainey of Western Electric patented a facsimile machine that transmitted its signal using 5-bit PCM, encoded by an opto-mechanical analog-to-digital converter (U.S. patent number 1,608,527; see also p. 8, Data Conversion Handbook, Walter Allan Kester, ed., Newnes, 2005). The machine did not go into production.
British engineer Alec Reeves, unaware of previous work, conceived the use of PCM for voice communication in 1937 while working for International Telephone and Telegraph in France. He described the theory and its advantages, but no practical application resulted. Reeves filed for a French patent in 1938, and his US patent was granted in 1943. By this time Reeves had started working at the Telecommunications Research Establishment.
The first transmission of speech by digital techniques, the SIGSALY encryption equipment, conveyed high-level Allied communications during World War II. In 1943 the Bell Labs researchers who designed the SIGSALY system became aware of the use of PCM binary coding as already proposed by Reeves. In 1949, for the Canadian Navy's DATAR system, Ferranti Canada built a working PCM radio system that was able to transmit digitized radar data over long distances.
PCM in the late 1940s and early 1950s used a cathode-ray coding tube with a plate electrode having encoding perforations. As in an oscilloscope, the beam was swept horizontally at the sample rate while the vertical deflection was controlled by the input analog signal, causing the beam to pass through higher or lower portions of the perforated plate. The plate collected or passed the beam, producing current variations in binary code, one bit at a time. Rather than natural binary, the grid of Goodall's later tube was perforated to produce a glitch-free Gray code and produced all bits simultaneously by using a fan beam instead of a scanning beam.
In the United States, the National Inventors Hall of Fame has honored Bernard M. Oliver and Claude Shannon as the inventors of PCM, as described in "Communication System Employing Pulse Code Modulation", filed in 1946 and 1952, granted in 1956. Another patent by the same title was filed by John R. Pierce in 1945, and issued in 1948. The three of them published "The Philosophy of PCM" in 1948.
The T-carrier system, introduced in 1961, uses two twisted-pair transmission lines to carry 24 PCM telephone calls sampled at 8 kHz and 8-bit resolution. This development improved capacity and call quality compared to the previous frequency-division multiplexing schemes.
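The capacity figures above can be checked with a little arithmetic. The sketch below assumes one framing bit per 24-channel frame, a detail of the T1 line format that is not stated in the text; all variable names are illustrative.

```python
CHANNELS = 24              # PCM telephone calls carried
BITS_PER_SAMPLE = 8        # 8-bit resolution
FRAMES_PER_SECOND = 8000   # one sample per channel per frame, 8 kHz sampling

payload_bits_per_frame = CHANNELS * BITS_PER_SAMPLE            # 192 bits
payload_rate = payload_bits_per_frame * FRAMES_PER_SECOND      # 1,536,000 bit/s

# Assumption not taken from the text: one framing bit is added to each frame.
FRAMING_BITS_PER_FRAME = 1
line_rate = (payload_bits_per_frame + FRAMING_BITS_PER_FRAME) * FRAMES_PER_SECOND
```

Under that assumption the line rate comes to 1,544,000 bit/s, of which 1,536,000 bit/s carries voice samples.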
In 1973, adaptive differential pulse-code modulation (ADPCM) was developed by P. Cummiskey, Nikil Jayant and James L. Flanagan (P. Cummiskey, N. S. Jayant, and J. L. Flanagan, "Adaptive quantization in differential PCM coding of speech," Bell Syst. Tech. J., vol. 52, pp. 1105–1118, Sept. 1973).
In 1972, Denon unveiled the first 8-channel digital recorder, the DN-023R, which used a 4-head open reel broadcast video tape recorder to record 47.25 kHz, 13-bit PCM audio. The first recording with this new system was made in Tokyo during April 24–26, 1972. In 1977, Denon developed the portable PCM recording system, the DN-034R. Like the DN-023R, it recorded 8 channels at 47.25 kHz, but it used 14 bits "with emphasis, making it equivalent to 15.5 bits."
In 1979, the first digital pop album, Bop till You Drop, was recorded. It was recorded in 50 kHz, 16-bit linear PCM using a 3M digital tape recorder.
The compact disc (CD) brought PCM to consumer audio applications with its introduction in 1982. The CD uses a 44,100 Hz sampling frequency and 16-bit resolution and stores up to 80 minutes of stereo audio per disc.
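The CD parameters above imply a fixed raw data rate, which the following Python sketch works out. It counts audio samples only and ignores error correction and other disc overhead; the variable names are illustrative.

```python
SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
BIT_DEPTH = 16            # bits per sample
CHANNELS = 2              # stereo

bit_rate = SAMPLE_RATE_HZ * BIT_DEPTH * CHANNELS   # 1,411,200 bit/s
byte_rate = bit_rate // 8                          # 176,400 bytes/s
bytes_for_80_min = byte_rate * 80 * 60             # 846,720,000 bytes
```

An 80-minute disc therefore holds roughly 847 million bytes of raw PCM audio.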
To recover the original signal from the sampled data, a demodulator can apply the procedure of modulation in reverse. After each sampling period, the demodulator reads the next value and transitions the output signal to the new value. As a result of these transitions, the signal retains a significant amount of high-frequency energy due to imaging effects. To remove these undesirable frequencies, the demodulator passes the signal through a reconstruction filter that suppresses energy outside the expected frequency range (greater than the Nyquist frequency, which is half the sampling rate). Some systems use digital filtering to remove some of the aliasing, converting the signal from digital to analog at a higher sample rate such that the analog anti-aliasing filter is much simpler. In some systems, no explicit filtering is done at all; as it is impossible for any system to reproduce a signal with infinite bandwidth, inherent losses in the system compensate for the artifacts, or the system simply does not require much precision.
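The staircase output and its smoothing can be sketched in Python as follows. The hold step mirrors the "transition to the new value" behaviour described above. The moving average is only a crude stand-in for a reconstruction filter, chosen here for brevity, and real designs use far sharper low-pass filters. Both function names are hypothetical.

```python
def zero_order_hold(samples, factor):
    """Hold each sample for `factor` output steps, producing the
    staircase waveform a simple demodulator would emit."""
    out = []
    for s in samples:
        out.extend([s] * factor)
    return out

def moving_average(values, width):
    """Crude low-pass filter (an illustrative assumption, not a real
    reconstruction filter): average each value with up to `width - 1`
    of its predecessors to soften the staircase edges."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out

staircase = zero_order_hold([0, 4], factor=2)    # [0, 0, 4, 4]
smoothed = moving_average(staircase, width=2)    # [0.0, 0.0, 2.0, 4.0]
```

The abrupt jump from 0 to 4 in `staircase` becomes a two-step ramp in `smoothed`, which is the time-domain counterpart of removing high-frequency energy.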
LPCM encodes a single sound channel. Support for multichannel audio depends on file format and relies on synchronization of multiple LPCM streams. While two channels (stereo) is the most common format, systems can support up to 8 audio channels (7.1 surround) or more.
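One common way for a file format to keep several LPCM channels synchronized is to interleave them, storing one sample from each channel per frame. The Python sketch below assumes that layout with 16-bit signed little-endian samples; the function name `interleave_stereo` is hypothetical, and a given format may arrange channels differently.

```python
import struct

def interleave_stereo(left, right):
    """Pack two equal-length lists of 16-bit signed samples into
    interleaved little-endian frames: L0 R0 L1 R1 ..."""
    if len(left) != len(right):
        raise ValueError("channels must contain the same number of samples")
    frames = bytearray()
    for l_sample, r_sample in zip(left, right):
        frames += struct.pack("<hh", l_sample, r_sample)  # one stereo frame
    return bytes(frames)

data = interleave_stereo([1, 2], [-1, -2])
```

Because both channels advance by exactly one sample per frame, a reader that consumes whole frames cannot drift out of step between left and right.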
Common sampling frequencies are 48 kHz as used with DVD format videos, or 44.1 kHz as used in CDs. Sampling frequencies of 96 kHz or 192 kHz can be used on some equipment, but the benefits have been debated.
Regardless, there are potential sources of impairment implicit in any PCM system.
In telephony, a standard audio signal for a single phone call is encoded as 8,000 samples per second, of 8 bits each, giving a 64 kbit/s digital signal known as DS0. The default signal compression encoding on a DS0 is either μ-law PCM (North America and Japan) or A-law PCM (Europe and most of the rest of the world). These are logarithmic compression systems where a 12- or 13-bit linear PCM sample number is mapped into an 8-bit value. This system is described by international standard G.711.
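The logarithmic mapping can be illustrated with the continuous μ-law curve, F(x) = sgn(x) · ln(1 + μ|x|) / ln(1 + μ) with μ = 255. The Python sketch below uses that textbook formula on samples normalized to [-1, 1]. It is not the G.711 encoder itself, which works on integer codes with a piecewise-linear approximation of the curve; the function names are illustrative.

```python
import math

MU = 255  # companding parameter of the mu-law curve

def mu_law_compress(x):
    """Map a linear sample in [-1, 1] onto the logarithmic mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y):
    """Inverse of mu_law_compress: recover the linear sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# DS0 bit rate: 8,000 samples per second of 8 bits each.
ds0_bit_rate = 8000 * 8   # 64,000 bit/s

quiet = mu_law_compress(0.01)   # about 0.23 of full scale
```

A sample at 1% of full scale lands at roughly 23% of the compressed range, so quiet passages receive many more of the 256 available codes than they would under linear quantization.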
Where circuit costs are high and loss of voice quality is acceptable, it sometimes makes sense to compress the voice signal even further. An ADPCM algorithm is used to map a series of 8-bit μ-law or A-law PCM samples into a series of 4-bit ADPCM samples. In this way, the capacity of the line is doubled. The technique is detailed in the G.726 standard.
Audio coding formats and audio codecs have been developed to achieve further compression. Some of these techniques have been standardized and patented. Advanced compression techniques, such as modified discrete cosine transform (MDCT) and linear predictive coding (LPC), are now widely used in mobile phones, voice over IP (VoIP) and streaming media.
Ones-density is often controlled using precoding techniques such as run-length limited encoding, where the PCM code is expanded into a slightly longer code with a guaranteed bound on ones-density before modulation into the channel. In other cases, extra bits are added into the stream, which guarantees at least occasional symbol transitions.
Another technique used to control ones-density is the use of a scrambler on the data, which will tend to turn the data stream into a stream that looks pseudo-random, but where the data can be recovered exactly by a complementary descrambler. In this case, long runs of zeroes or ones are still possible on the output but are considered unlikely enough to allow reliable synchronization.
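A minimal Python sketch of such a scrambler follows. It XORs the data with the output of a 7-bit linear-feedback shift register; applying the same operation with the same seed descrambles. The register length, tap positions and seed value are illustrative assumptions and are not taken from any particular transmission standard.

```python
def lfsr_keystream(seed, n_bits):
    """Yield n_bits from a 7-bit Fibonacci LFSR whose feedback is the
    XOR of its two highest bits. Taps and length are illustrative."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, or the register stays at zero")
    for _ in range(n_bits):
        bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | bit) & 0x7F
        yield bit

def scramble(bits, seed=0x5A):
    """XOR each data bit with the keystream. Calling scramble again
    with the same seed restores the original bits."""
    return [b ^ k for b, k in zip(bits, lfsr_keystream(seed, len(bits)))]

data = [0] * 32                  # a long run of zeroes
scrambled = scramble(data)       # now contains a mix of ones and zeroes
recovered = scramble(scrambled)  # identical to data
```

The all-zero input is exactly the case that would otherwise starve a receiver of transitions. After scrambling, the line carries the pseudo-random keystream instead, yet the descrambler recovers the zeroes exactly.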
In other cases, the long term DC value of the modulated signal is important, as building up a DC bias will tend to move communications circuits out of their operating range. In this case, special measures are taken to keep a count of the cumulative DC bias and to modify the codes if necessary to make the DC bias always tend back to zero.
Many of these codes are bipolar codes, where the pulses can be positive, negative or absent. In the typical alternate mark inversion code, non-zero pulses alternate between being positive and negative. These rules may be violated to generate special symbols used for framing or other special purposes.
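The alternate mark inversion rule is simple enough to sketch in a few lines of Python. The function name `ami_encode` and the choice of a positive first pulse are assumptions made for illustration.

```python
def ami_encode(bits):
    """Encode bits as pulses: 0 -> no pulse, 1 -> a pulse whose polarity
    is the opposite of the previous non-zero pulse."""
    pulses = []
    polarity = 1                     # assumed polarity of the first mark
    for bit in bits:
        if bit:
            pulses.append(polarity)
            polarity = -polarity     # alternate for the next mark
        else:
            pulses.append(0)
    return pulses

pulses = ami_encode([1, 0, 1, 1, 0, 1])   # [1, 0, -1, 1, 0, -1]
```

Because every positive pulse is followed by a negative one, the running sum of the output never leaves the range 0 to 1, so no DC bias accumulates.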
For effective reconstruction of the voice signal, telephony applications typically use an 8000 Hz sampling frequency, which is more than twice the highest usable voice frequency.
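The "more than twice" condition can be illustrated by computing where a tone appears after sampling. The Python sketch below assumes an upper voice-band edge of 3400 Hz, a commonly quoted figure that is not stated in the text; the function name `apparent_frequency` is hypothetical.

```python
def apparent_frequency(tone_hz, sample_rate_hz):
    """Return the frequency at which a sampled tone appears, i.e. its
    distance from the nearest integer multiple of the sample rate."""
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)

SAMPLE_RATE_HZ = 8000
VOICE_BAND_EDGE_HZ = 3400   # assumed upper edge of the usable voice band

in_band = apparent_frequency(VOICE_BAND_EDGE_HZ, SAMPLE_RATE_HZ)  # 3400
aliased = apparent_frequency(5000, SAMPLE_RATE_HZ)                # 3000
```

A 3400 Hz tone is below half the sampling rate and is preserved, whereas a 5000 Hz tone would fold back to 3000 Hz and be indistinguishable from a genuine in-band component.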